161 research outputs found
Distilling Causal Effect from Miscellaneous Other-Class for Continual Named Entity Recognition
Continual Learning for Named Entity Recognition (CL-NER) aims to learn a
growing number of entity types over time from a stream of data. However, simply
learning Other-Class in the same way as new entity types amplifies
catastrophic forgetting and leads to a substantial performance drop. The main
cause behind this is that Other-Class samples usually contain old entity types,
and the old knowledge in these Other-Class samples is not preserved properly.
Through causal inference, we identify that this forgetting is caused by the
missing causal effect from the old data. To this end, we propose a unified
causal framework to retrieve the causality from both new entity types and
Other-Class. Furthermore, we apply curriculum learning to mitigate the impact
of label noise and introduce a self-adaptive weight for balancing the causal
effects between new entity types and Other-Class. Experimental results on three
benchmark datasets show that our method outperforms the state-of-the-art method
by a large margin. Moreover, our method can be combined with existing
state-of-the-art methods to further improve performance in CL-NER.
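The abstract does not give the loss formulation, so the following is only a minimal PyTorch sketch of one way the two causal effects could be balanced: a cross-entropy term on tokens labelled with new entity types, a distillation term on Other-Class tokens against the frozen old model, and a self-adaptive weight between them. All names and the specific weighting rule are illustrative assumptions, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def combined_causal_loss(new_logits, old_logits, labels,
                         other_class_id=0, temperature=2.0):
    """Sketch: balance a distillation loss on Other-Class tokens (preserving
    the old model's knowledge of old entity types) against a cross-entropy
    loss on new entity types. Assumes the batch contains both token groups.

    new_logits: (N, C_new) logits of the model being trained
    old_logits: (N, C_old) logits of the frozen old model, C_old <= C_new
    labels:     (N,) token labels under the new annotation scheme
    """
    is_other = labels.eq(other_class_id)

    # Effect from new entity types: plain cross-entropy on non-Other tokens.
    ce_new = F.cross_entropy(new_logits[~is_other], labels[~is_other])

    # Effect from Other-Class: KL distillation against the old model,
    # restricted to the classes the old model knows about.
    c_old = old_logits.size(-1)
    p_old = F.softmax(old_logits[is_other] / temperature, dim=-1)
    log_p_new = F.log_softmax(new_logits[is_other][:, :c_old] / temperature, dim=-1)
    kd_other = F.kl_div(log_p_new, p_old, reduction="batchmean") * temperature ** 2

    # Self-adaptive weight (illustrative choice): the fraction of
    # Other-Class tokens in the current batch.
    w = is_other.float().mean()
    return (1 - w) * ce_new + w * kd_other
```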
From Bitcoin to Solana -- Innovating Blockchain towards Enterprise Applications
This survey presents a comprehensive study of recent advances in blockchain
technologies, focusing on how the issues affecting enterprise adoption were
progressively addressed from the original Bitcoin system to Ethereum, Solana,
and beyond. The key issues preventing wide adoption are scalability and
performance, and recent advances in Solana have clearly demonstrated that it is
possible to improve significantly on those issues by innovating on data
structures, processes, and algorithms: consolidating various time-consuming
algorithms and security enforcements, and differentiating and balancing users
and their responsibilities and rights, while maintaining the required security
and integrity that blockchain systems inherently offer.
Multi-view 3D Face Reconstruction Based on Flame
At present, 3D face reconstruction has broad application prospects in various
fields, but research on it is still at an early stage. In this paper, we aim to
achieve better 3D face reconstruction quality by combining a multi-view
training framework with the parametric face model Flame, and we propose a
multi-view training and testing model, MFNet (Multi-view Flame Network). We
build a self-supervised training framework and impose constraints such as a
multi-view optical flow loss and a facial landmark loss, finally obtaining the
complete MFNet. We propose novel implementations of the multi-view optical
flow loss and the covisible mask. We test our model on the AFLW and FaceScape
datasets and also take pictures of our own faces to reconstruct 3D faces under
conditions that simulate actual scenarios as closely as possible, achieving
good results. Our work mainly addresses the problem of combining parametric
face models with multi-view 3D face reconstruction and explores the
implementation of a Flame-based multi-view training and testing framework,
contributing to the field of 3D face reconstruction.
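The exact loss terms are not given in the abstract; the snippet below is a minimal PyTorch sketch of two of the constraints mentioned, a per-view landmark loss and a photometric consistency loss restricted by a covisible mask, with the optical-flow term omitted for brevity. Function names, tensor shapes, and weights are illustrative assumptions rather than the paper's formulation.

```python
import torch

def landmark_loss(pred_lmk_2d, gt_lmk_2d):
    """Mean L2 distance between projected model landmarks and detected
    landmarks over all views. Shapes: (V, K, 2)."""
    return (pred_lmk_2d - gt_lmk_2d).norm(dim=-1).mean()

def masked_multiview_photometric_loss(rendered, photos, covisible_mask):
    """Photometric consistency restricted to covisible pixels.
    rendered/photos: (V, 3, H, W); covisible_mask: (V, 1, H, W) in {0, 1}."""
    diff = (rendered - photos).abs() * covisible_mask
    return diff.sum() / covisible_mask.sum().clamp(min=1)

def total_loss(rendered, photos, covisible_mask, pred_lmk_2d, gt_lmk_2d,
               w_photo=1.0, w_lmk=0.1):
    """Combine the two terms; the weights here are hand-picked for
    illustration, not the paper's values."""
    return (w_photo * masked_multiview_photometric_loss(rendered, photos, covisible_mask)
            + w_lmk * landmark_loss(pred_lmk_2d, gt_lmk_2d))
```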
Too Large; Data Reduction for Vision-Language Pre-Training
This paper examines the problems of severe image-text misalignment and high
redundancy in the widely-used large-scale Vision-Language Pre-Training (VLP)
datasets. To address these issues, we propose an efficient and straightforward
Vision-Language learning algorithm called TL;DR, which aims to compress the
existing large VLP data into a small, high-quality set. Our approach consists
of two major steps. First, a codebook-based encoder-decoder captioner is
developed to select representative samples. Second, a new caption is generated
to complement the original captions for selected samples, mitigating the
text-image misalignment problem while maintaining uniqueness. As a result,
TL;DR enables us to reduce the large dataset into a small set of high-quality
data, which can serve as an alternative pre-training dataset. This algorithm
significantly speeds up the time-consuming pretraining process. Specifically,
TL;DR can compress mainstream VLP datasets at a high ratio, e.g., reducing the
well-cleaned CC3M dataset from 2.82M to 0.67M (24%) and the noisy YFCC15M
from 15M to 2.5M (16.7%). Extensive experiments with three popular VLP
models over seven downstream tasks show that a VLP model trained on the
compressed dataset provided by TL;DR can achieve similar or even better results
compared with training on the full-scale dataset. The code will be made
available at https://github.com/showlab/data-centric.vlp.
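The codebook-based captioner itself is not described in detail in the abstract, so the snippet below is only a stand-in sketch of the first step, selecting representative samples: it clusters joint image-text embeddings with k-means and keeps, per cluster, the sample nearest the centroid. K-means here merely substitutes for the learned codebook, and every name in the snippet is an assumption for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

def select_representatives(embeddings, n_codes=1000, seed=0):
    """Cluster embeddings into n_codes groups and return the index of the
    sample closest to each cluster centroid (a proxy for codebook-based
    representative selection)."""
    km = KMeans(n_clusters=n_codes, random_state=seed, n_init=10).fit(embeddings)
    reps = []
    for c in range(n_codes):
        members = np.where(km.labels_ == c)[0]
        if members.size == 0:
            continue
        dists = np.linalg.norm(embeddings[members] - km.cluster_centers_[c], axis=1)
        reps.append(members[np.argmin(dists)])
    return np.asarray(reps)

# Example: keep roughly 1k representatives out of 50k (random) embeddings.
if __name__ == "__main__":
    embs = np.random.randn(50_000, 256).astype(np.float32)
    print(select_representatives(embs, n_codes=1000).shape)
```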
Free-ATM: Exploring Unsupervised Learning on Diffusion-Generated Images with Free Attention Masks
Despite the rapid advancement of unsupervised learning in visual
representation, it requires training on large-scale datasets that demand costly
data collection and pose additional challenges due to data-privacy concerns.
Recently, synthetic images generated by text-to-image diffusion models have
shown great potential for benefiting image recognition. Although promising,
unsupervised learning on diffusion-generated images remains under-explored. To
address this, we start by uncovering
that diffusion models' cross-attention layers inherently provide
annotation-free attention masks aligned with corresponding text inputs on
generated images. We then investigate the problems of three prevalent
unsupervised learning techniques (i.e., contrastive learning, masked modeling,
and vision-language pretraining) and introduce customized solutions by fully
exploiting the aforementioned free attention masks. Our approach is validated
through extensive experiments that show consistent improvements in baseline
models across various downstream tasks, including image classification,
detection, segmentation, and image-text retrieval. With our method, it is
possible to close the performance gap between unsupervised pretraining on
synthetic data and pretraining on real-world data.
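How the free attention masks are extracted depends on the diffusion implementation and is not spelled out in the abstract; the sketch below only shows the post-processing step of turning recorded cross-attention maps into a binary mask for one prompt token. The tensor layout, threshold, and function name are assumptions for illustration.

```python
import torch

def mask_from_cross_attention(attn_maps, token_index, threshold=0.5):
    """Turn cross-attention maps recorded during diffusion sampling into a
    binary foreground mask for one prompt token.

    attn_maps:   (T, heads, H, W, n_tokens) attention over T denoising steps
                 (the hooking code that collects these is implementation-
                 specific and not shown here).
    token_index: index of the prompt token to build a mask for.
    Returns an (H, W) mask with values in {0, 1}.
    """
    m = attn_maps[..., token_index].mean(dim=(0, 1))   # average steps and heads
    m = (m - m.min()) / (m.max() - m.min() + 1e-8)     # normalize to [0, 1]
    return (m > threshold).float()

# The resulting mask can then guide, e.g., masked image modeling or positive
# crop selection for contrastive learning (illustrative uses, not the paper's
# exact recipe).
```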
Dataset Condensation via Generative Model
Dataset condensation aims to condense a large dataset with many training
samples into a small set. Previous methods usually condense the dataset into
pixel format. However, this suffers from slow optimization and a large number
of parameters to optimize. As image resolution and the number of classes
increase, the number of learnable parameters grows accordingly, preventing
condensation methods from scaling up to large datasets with diverse classes.
Moreover, the relations among condensed samples have been neglected and hence
the feature distribution of condensed samples is often not diverse. To solve
these problems, we propose to condense the dataset into another format, a
generative model. Such a novel format allows for the condensation of large
datasets because the size of the generative model remains relatively stable as
the number of classes or image resolution increases. Furthermore, an
intra-class and an inter-class loss are proposed to model the relation of
condensed samples. Intra-class loss aims to create more diverse samples for
each class by pushing each sample away from the others of the same class.
Meanwhile, inter-class loss increases the discriminability of samples by
widening the gap between the centers of different classes. Extensive
comparisons with state-of-the-art methods and our ablation studies confirm the
effectiveness of our method and its individual components. To the best of our
knowledge, we are the first to successfully conduct condensation on
ImageNet-1k.
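The abstract describes the intra-class and inter-class losses only verbally, so the following is a minimal PyTorch sketch of one plausible reading: penalize similarity between same-class samples, and hinge on the distances between class centers. The exact formulations, margin, and names are assumptions, not the paper's definitions.

```python
import torch
import torch.nn.functional as F

def intra_class_loss(features, labels):
    """Push samples of the same class apart by penalizing their pairwise
    cosine similarity (one plausible reading of the intra-class objective)."""
    feats = F.normalize(features, dim=-1)
    sim = feats @ feats.t()
    same = labels.unsqueeze(0).eq(labels.unsqueeze(1))
    same.fill_diagonal_(False)                     # ignore self-similarity
    return sim[same].mean() if same.any() else sim.new_zeros(())

def inter_class_loss(features, labels, margin=1.0):
    """Widen gaps between class centers with a hinge on pairwise distances
    between per-class mean features (illustrative formulation)."""
    classes = labels.unique()
    centers = torch.stack([features[labels == c].mean(dim=0) for c in classes])
    dists = torch.cdist(centers, centers)
    off_diag = ~torch.eye(len(classes), dtype=torch.bool, device=dists.device)
    return F.relu(margin - dists[off_diag]).mean()
```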
Synthesis and Catalytic Activity of Iron Hydride Ligated with Bidentate N-Heterocyclic Silylenes for Hydroboration of Carbonyl Compounds
We report the synthesis
of a novel bidentate N-heterocyclic silylene
(NHSi) ligand, N-(LSi:)-N-methyl-2-pyridinamine
(1) (L = PhC(NtBu)2), and
the first bischelate disilylene iron hydride, [(Si,N)(Si,C)Fe(H)(PMe3)] (2), and monosilylene iron hydride, [(Si,C)Fe(H)(PMe3)3] (2′), through Csp2–H activation of the NHSi ligand. Compounds 1 and 2 were fully characterized by spectroscopic
methods and single-crystal X-ray diffraction analysis. Density functional
theory calculations indicated the multiple-bond character of the Fe–Si
bonds and the π back-donation from Fe(II) to the Si(II) center.
Moreover, the strong donor character of ligand 1 enables 2 to act as an efficient catalyst for the hydroboration reaction
of carbonyl compounds at room temperature. Chemoselective hydroboration
is attained under these conditions. This might be the first example
of hydroboration of ketones and aldehydes catalyzed by a silylene
hydrido iron complex. A catalytic mechanism was suggested and partially
experimentally verified.